Update Wappalyzer #3800

Merged
merged 41 commits into from
Jan 10, 2025

Conversation

ammar92
Contributor

@ammar92 ammar92 commented Nov 6, 2024

Changes

This update replaces the outdated, archived Wappalyzer dependency with a more current fork. It also adds a script that downloads an updated technologies file from a project that aims to maintain the Wappalyzer technologies files. For now, the technologies file has to be updated manually from time to time, but we should work towards a plan for automating this.
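
A minimal sketch of what such a download script could look like, not the script added in this PR. The function names, the `fetch` parameter, and the assumption that the technologies file is a single non-empty JSON object are all illustrative; the source URL is left as a parameter because it is not stated here.

```python
import json
import os
import tempfile
import urllib.request
from pathlib import Path
from typing import Callable


def fetch_over_http(url: str, timeout: float = 30.0) -> bytes:
    """Download the raw bytes at `url` using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def update_technologies_file(
    url: str,
    destination: Path,
    fetch: Callable[[str], bytes] = fetch_over_http,  # hypothetical hook, eases testing
) -> int:
    """Download a technologies file, validate it, and replace `destination`.

    Returns the number of top-level entries in the downloaded JSON object.
    """
    payload = fetch(url)

    # Assumption: the file is one non-empty JSON object. Anything else is
    # rejected so a failed download never overwrites a working file.
    data = json.loads(payload)
    if not isinstance(data, dict) or not data:
        raise ValueError("downloaded technologies file is empty or not a JSON object")

    # Write to a temporary file in the same directory, then rename it over
    # the target, so readers never see a half-written file.
    destination.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(destination.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, destination)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return len(data)
```

Validating before replacing keeps the manual update safe to repeat, and the same function could later be called from a scheduled job if the update is automated.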

Issue link

Closes #3037

QA notes

Run the Wappalyzer normalizer and see if it still produces Software and SoftwareInstance OOIs.


Code Checklist

  • All the commits in this PR are properly PGP-signed and verified.
  • This PR only contains functionality relevant to the issue.
  • I have written unit tests for the changes or fixes I made.
  • I have checked the documentation and made changes where necessary.
  • I have performed a self-review of my code and refactored it to the best of my abilities.
  • Tickets have been created for newly discovered issues.
  • For any non-trivial functionality, I have added integration and/or end-to-end tests.
  • I have informed others of any required .env file changes and updated .env-dist accordingly.
  • I have included comments in the code to elaborate on what is not self-evident from the code itself, including references to issues and discussions online, or implicit behavior of an interface.

Checklist for code reviewers:

Copy-paste the checklist from the docs/source/templates folder into your comment.


Checklist for QA:

Copy-paste the checklist from the docs/source/templates folder into your comment.

@ammar92 ammar92 requested a review from a team as a code owner November 6, 2024 07:48
@noamblitz
Contributor

Can confirm that this still generates Software and SoftwareInstance OOIs, but I cannot confirm whether there are more or fewer of them now.
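
One way to settle the "more or fewer" question is to count OOIs per type before and after the update. A minimal sketch, assuming each OOI exposes an `object_type` string; `count_by_type` and `compare_counts` are hypothetical helpers, not OpenKAT functions, and how the two OOI lists are collected is left open.

```python
from collections import Counter
from typing import Iterable


def count_by_type(oois: Iterable[object]) -> Counter:
    """Count objects per `object_type`, falling back to the class name."""
    return Counter(getattr(ooi, "object_type", type(ooi).__name__) for ooi in oois)


def compare_counts(
    before: Iterable[object],
    after: Iterable[object],
    types: tuple = ("Software", "SoftwareInstance"),
) -> dict:
    """Return `after - before` per type; a positive value means more OOIs after the update."""
    counts_before = count_by_type(before)
    counts_after = count_by_type(after)
    # Counter returns 0 for missing keys, so absent types need no special case.
    return {name: counts_after[name] - counts_before[name] for name in types}
```

Running the normalizer on the same raw files with the old and the new dependency and passing both result sets to `compare_counts` would show the difference per type.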

@underdarknl underdarknl added the 😸 Review/QA feedback Review/QA feedback provided label Dec 4, 2024
@stephanie0x00
Contributor

Observed the following two things:

  1. One of the normalizers doesn't resolve pending findings, even though no normalizers have failed.

  2. Two boefjes failed on the initial run. Errors are pasted below. Rescheduling works.

webpageanalysis port 443

boefje-1  | 2024-12-16T12:54:46.573192 [info] Saved raw file 8776c2f4-c377-45c3-bf43-9d69a945ec78 for boefje webpage-analysis[e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea]
boefje-1  | HTTP Request: POST http://bytes:8000/bytes/raw?boefje_meta_id=e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea "HTTP/1.1 500 Internal Server Error"
boefje-1  | HTTP Request: POST http://bytes:8000/bytes/raw?boefje_meta_id=1d2be888-9394-4487-8dfb-8dba39a0517b "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.634357 [error] {"detail":"Could not save raw data"}
boefje-1  | 2024-12-16T12:54:46.635861 [info] Saved raw file 6a9d1de2-dee2-4463-a6f7-4023dfd783fb for boefje webpage-analysis[1d2be888-9394-4487-8dfb-8dba39a0517b]
boefje-1  | HTTP Request: GET http://scheduler:8000/tasks/1d2be888-9394-4487-8dfb-8dba39a0517b "HTTP/1.1 200 OK"
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/1d2be888-9394-4487-8dfb-8dba39a0517b "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.643849 [info] Set status to TaskStatus.COMPLETED in the scheduler for task[id=1d2be888-9394-4487-8dfb-8dba39a0517b]
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/066ed423-1937-40be-9043-c08f9e30ecef "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.648229 [info] Handling boefje kat-finding-types[task_id=066ed423-1937-40be-9043-c08f9e30ecef]
boefje-1  | HTTP Request: GET http://octopoes_api/aa/object?reference=KATFindingType%7CKAT-NO-CERTIFICATE&valid_time=2024-12-16%2012%3A54%3A46.664604%2B00%3A00 "HTTP/1.1 200 OK"
boefje-1  | HTTP Request: GET http://katalogus:8000/v1/organisations/aa/kat-finding-types/settings "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.708820 [info] Starting boefje kat-finding-types[066ed423-1937-40be-9043-c08f9e30ecef]
boefje-1  | 2024-12-16T12:54:46.709440 [info] Saving to Bytes for boefje kat-finding-types[066ed423-1937-40be-9043-c08f9e30ecef]
boefje-1  | HTTP Request: POST http://bytes:8000/bytes/boefje_meta "HTTP/1.1 201 Created"
boefje-1  | HTTP Request: POST http://bytes:8000/bytes/raw?boefje_meta_id=066ed423-1937-40be-9043-c08f9e30ecef "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.764085 [info] Saved raw file 539b6ce4-be7a-4935-b2ca-731fd4f3214e for boefje kat-finding-types[066ed423-1937-40be-9043-c08f9e30ecef]
boefje-1  | HTTP Request: GET http://scheduler:8000/tasks/066ed423-1937-40be-9043-c08f9e30ecef "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.634593 [error] An error occurred handling scheduler item[id=e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea]
boefje-1  | ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
boefje-1  | │ /app/boefjes/boefjes/app.py:251 in _start_working                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │   248 │   │                                                                                      │
boefje-1  | │   249 │   │   try:                                                                               │
boefje-1  | │   250 │   │   │   scheduler_client.patch_task(p_item.id, TaskStatus.RUNNING)                     │
boefje-1  | │ ❱ 251 │   │   │   handler.handle(p_item.data)                                                    │
boefje-1  | │   252 │   │   │   status = TaskStatus.COMPLETED                                                  │
boefje-1  | │   253 │   │   except Exception:  # noqa                                                          │
boefje-1  | │   254 │   │   │   logger.exception("An error occurred handling scheduler item[id=%s]", p_item.   │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │          handler = <boefjes.job_handler.BoefjeHandler object at 0x78a7cf145710>              │ │
boefje-1  | │ │   handling_tasks = <DictProxy object, typeid 'dict' at 0x78a7cf4c4fd0>                       │ │
boefje-1  | │ │           p_item = Task(                                                                     │ │
boefje-1  | │ │                    │   id=UUID('e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'),                      │ │
boefje-1  | │ │                    │   scheduler_id='boefje-aa',                                             │ │
boefje-1  | │ │                    │   schedule_id='715ecad0-bb83-4dc1-a89b-af5df1acbb83',                   │ │
boefje-1  | │ │                    │   priority=2,                                                           │ │
boefje-1  | │ │                    │   status=<TaskStatus.DISPATCHED: 'dispatched'>,                         │ │
boefje-1  | │ │                    │   type='boefje',                                                        │ │
boefje-1  | │ │                    │   hash='45535cde270079f75c71a3b2d9c478a7',                              │ │
boefje-1  | │ │                    │   data=BoefjeMeta(                                                      │ │
boefje-1  | │ │                    │   │   id=UUID('e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'),                  │ │
boefje-1  | │ │                    │   │   started_at=datetime.datetime(2024, 12, 16, 12, 54, 46, 312057,    │ │
boefje-1  | │ │                    tzinfo=datetime.timezone.utc),                                            │ │
boefje-1  | │ │                    │   │   ended_at=datetime.datetime(2024, 12, 16, 12, 54, 46, 466907,      │ │
boefje-1  | │ │                    tzinfo=datetime.timezone.utc),                                            │ │
boefje-1  | │ │                    │   │   boefje=Boefje(id='webpage-analysis', version=None),               │ │
boefje-1  | │ │                    │   │                                                                     │ │
boefje-1  | │ │                    input_ooi='HTTPResource|internet|134.209.85.72|tcp|443|https|internet|mi… │ │
boefje-1  | │ │                    │   │   arguments={                                                       │ │
boefje-1  | │ │                    │   │   │   'input': {                                                    │ │
boefje-1  | │ │                    │   │   │   │   'object_type': 'HTTPResource',                            │ │
boefje-1  | │ │                    │   │   │   │   'scan_profile': "scan_profile_type='inherited'            │ │
boefje-1  | │ │                    reference=Reference('HTTPResource|internet|134.209"+107,                  │ │
boefje-1  | │ │                    │   │   │   │   'user_id': 'None',                                        │ │
boefje-1  | │ │                    │   │   │   │   'primary_key':                                            │ │
boefje-1  | │ │                    'HTTPResource|internet|134.209.85.72|tcp|443|https|internet|mispo.es|htt… │ │
boefje-1  | │ │                    │   │   │   │   'website': {                                              │ │
boefje-1  | │ │                    │   │   │   │   │   'ip_service': PrimaryKeyToken(                        │ │
boefje-1  | │ │                    │   │   │   │   │   │   root={                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'ip_port': PrimaryKeyToken(                   │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   root={                                    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   'address': PrimaryKeyToken(           │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   root={                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   'network': PrimaryKeyToken(   │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   │   root={                    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   │   │   'name': 'internet'    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   │   }                         │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   ),                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   │   'address': '134.209.85.72'    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   │   }                                 │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   ),                                    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   'protocol': 'tcp',                    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   │   'port': '443'                         │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   }                                         │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   ),                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'service': PrimaryKeyToken(                   │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   root={'name': 'https'}                    │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   )                                             │ │
boefje-1  | │ │                    │   │   │   │   │   │   }                                                 │ │
boefje-1  | │ │                    │   │   │   │   │   ),                                                    │ │
boefje-1  | │ │                    │   │   │   │   │   'hostname': PrimaryKeyToken(                          │ │
boefje-1  | │ │                    │   │   │   │   │   │   root={                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'network': PrimaryKeyToken(                   │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   root={'name': 'internet'}                 │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   ),                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'name': 'mispo.es'                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   }                                                 │ │
boefje-1  | │ │                    │   │   │   │   │   )                                                     │ │
boefje-1  | │ │                    │   │   │   │   },                                                        │ │
boefje-1  | │ │                    │   │   │   │   'web_url': {                                              │ │
boefje-1  | │ │                    │   │   │   │   │   'scheme': 'https',                                    │ │
boefje-1  | │ │                    │   │   │   │   │   'netloc': PrimaryKeyToken(                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   root={                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'network': PrimaryKeyToken(                   │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   │   root={'name': 'internet'}                 │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   ),                                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   │   'name': 'mispo.es'                            │ │
boefje-1  | │ │                    │   │   │   │   │   │   }                                                 │ │
boefje-1  | │ │                    │   │   │   │   │   ),                                                    │ │
boefje-1  | │ │                    │   │   │   │   │   'port': '443',                                        │ │
boefje-1  | │ │                    │   │   │   │   │   'path': '/'                                           │ │
boefje-1  | │ │                    │   │   │   │   },                                                        │ │
boefje-1  | │ │                    │   │   │   │   'redirects_to': 'None'                                    │ │
boefje-1  | │ │                    │   │   │   }                                                             │ │
boefje-1  | │ │                    │   │   },                                                                │ │
boefje-1  | │ │                    │   │   organization='aa',                                                │ │
boefje-1  | │ │                    │   │                                                                     │ │
boefje-1  | │ │                    runnable_hash='740793eea70ddc8ad9a9659ff22c8befc673081e8d3dbb5cab6151669… │ │
boefje-1  | │ │                    │   │   environment={}                                                    │ │
boefje-1  | │ │                    │   ),                                                                    │ │
boefje-1  | │ │                    │   created_at=datetime.datetime(2024, 12, 16, 12, 54, 40, 109781,        │ │
boefje-1  | │ │                    tzinfo=TzInfo(UTC)),                                                      │ │
boefje-1  | │ │                    │   modified_at=datetime.datetime(2024, 12, 16, 12, 54, 40, 109785,       │ │
boefje-1  | │ │                    tzinfo=TzInfo(UTC))                                                       │ │
boefje-1  | │ │                    )                                                                         │ │
boefje-1  | │ │ scheduler_client = <boefjes.clients.scheduler_client.SchedulerAPIClient object at            │ │
boefje-1  | │ │                    0x78a7cf4aed10>                                                           │ │
boefje-1  | │ │           status = <TaskStatus.FAILED: 'failed'>                                             │ │
boefje-1  | │ │       task_queue = <AutoProxy[Queue] object, typeid 'Queue' at 0x78a7cf4c6310>               │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/job_handler.py:168 in handle                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │   165 │   │   │   │   │   │   │   )                                                              │
boefje-1  | │   166 │   │   │   │   │   │   else:                                                              │
boefje-1  | │   167 │   │   │   │   │   │   │   valid_mimetypes.add(mimetype)                                  │
boefje-1  | │ ❱ 168 │   │   │   │   │   raw_file_id = self.bytes_client.save_raw(                              │
boefje-1  | │   169 │   │   │   │   │   │   boefje_meta.id, output, _default_mime_types(boefje_meta.boefje).   │
boefje-1  | │   170 │   │   │   │   │   )                                                                      │
boefje-1  | │   171 │   │   │   │   │   logger.info(                                                           │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │ boefje_added_mime_types = {'text/html', 'openkat-http/body'}                                 │ │
boefje-1  | │ │             boefje_meta = BoefjeMeta(                                                        │ │
boefje-1  | │ │                           │   id=UUID('e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'),               │ │
boefje-1  | │ │                           │   started_at=datetime.datetime(2024, 12, 16, 12, 54, 46, 312057, │ │
boefje-1  | │ │                           tzinfo=datetime.timezone.utc),                                     │ │
boefje-1  | │ │                           │   ended_at=datetime.datetime(2024, 12, 16, 12, 54, 46, 466907,   │ │
boefje-1  | │ │                           tzinfo=datetime.timezone.utc),                                     │ │
boefje-1  | │ │                           │   boefje=Boefje(id='webpage-analysis', version=None),            │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           input_ooi='HTTPResource|internet|134.209.85.72|tcp|443|https|inte… │ │
boefje-1  | │ │                           │   arguments={                                                    │ │
boefje-1  | │ │                           │   │   'input': {                                                 │ │
boefje-1  | │ │                           │   │   │   'object_type': 'HTTPResource',                         │ │
boefje-1  | │ │                           │   │   │   'scan_profile': "scan_profile_type='inherited'         │ │
boefje-1  | │ │                           reference=Reference('HTTPResource|internet|134.209"+107,           │ │
boefje-1  | │ │                           │   │   │   'user_id': 'None',                                     │ │
boefje-1  | │ │                           │   │   │   'primary_key':                                         │ │
boefje-1  | │ │                           'HTTPResource|internet|134.209.85.72|tcp|443|https|internet|mispo… │ │
boefje-1  | │ │                           │   │   │   'website': {                                           │ │
boefje-1  | │ │                           │   │   │   │   'ip_service': PrimaryKeyToken(                     │ │
boefje-1  | │ │                           │   │   │   │   │   root={                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'ip_port': PrimaryKeyToken(                │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   root={                                 │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   'address': PrimaryKeyToken(        │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   root={                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   'network':                 │ │
boefje-1  | │ │                           PrimaryKeyToken(                                                   │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   │   root={                 │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   │   │   'name': 'internet' │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   │   }                      │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   ),                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   │   'address': '134.209.85.72' │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   │   }                              │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   ),                                 │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   'protocol': 'tcp',                 │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   │   'port': '443'                      │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   }                                      │ │
boefje-1  | │ │                           │   │   │   │   │   │   ),                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'service': PrimaryKeyToken(                │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   root={'name': 'https'}                 │ │
boefje-1  | │ │                           │   │   │   │   │   │   )                                          │ │
boefje-1  | │ │                           │   │   │   │   │   }                                              │ │
boefje-1  | │ │                           │   │   │   │   ),                                                 │ │
boefje-1  | │ │                           │   │   │   │   'hostname': PrimaryKeyToken(                       │ │
boefje-1  | │ │                           │   │   │   │   │   root={                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'network': PrimaryKeyToken(                │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   root={'name': 'internet'}              │ │
boefje-1  | │ │                           │   │   │   │   │   │   ),                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'name': 'mispo.es'                         │ │
boefje-1  | │ │                           │   │   │   │   │   }                                              │ │
boefje-1  | │ │                           │   │   │   │   )                                                  │ │
boefje-1  | │ │                           │   │   │   },                                                     │ │
boefje-1  | │ │                           │   │   │   'web_url': {                                           │ │
boefje-1  | │ │                           │   │   │   │   'scheme': 'https',                                 │ │
boefje-1  | │ │                           │   │   │   │   'netloc': PrimaryKeyToken(                         │ │
boefje-1  | │ │                           │   │   │   │   │   root={                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'network': PrimaryKeyToken(                │ │
boefje-1  | │ │                           │   │   │   │   │   │   │   root={'name': 'internet'}              │ │
boefje-1  | │ │                           │   │   │   │   │   │   ),                                         │ │
boefje-1  | │ │                           │   │   │   │   │   │   'name': 'mispo.es'                         │ │
boefje-1  | │ │                           │   │   │   │   │   }                                              │ │
boefje-1  | │ │                           │   │   │   │   ),                                                 │ │
boefje-1  | │ │                           │   │   │   │   'port': '443',                                     │ │
boefje-1  | │ │                           │   │   │   │   'path': '/'                                        │ │
boefje-1  | │ │                           │   │   │   },                                                     │ │
boefje-1  | │ │                           │   │   │   'redirects_to': 'None'                                 │ │
boefje-1  | │ │                           │   │   }                                                          │ │
boefje-1  | │ │                           │   },                                                             │ │
boefje-1  | │ │                           │   organization='aa',                                             │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           runnable_hash='740793eea70ddc8ad9a9659ff22c8befc673081e8d3dbb5cab… │ │
boefje-1  | │ │                           │   environment={}                                                 │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │          boefje_results = [                                                                  │ │
boefje-1  | │ │                           │   (                                                              │ │
boefje-1  | │ │                           │   │   {'application/json+har'},                                  │ │
boefje-1  | │ │                           │   │   b'{"log": {"version": "1.2", "creator": {"name":           │ │
boefje-1  | │ │                           "kat_webpage_analysis", "version"'+3454                            │ │
boefje-1  | │ │                           │   ),                                                             │ │
boefje-1  | │ │                           │   (                                                              │ │
boefje-1  | │ │                           │   │   {'openkat-http/headers'},                                  │ │
boefje-1  | │ │                           │   │   '{"Server": "nginx/1.18.0", "Date": "Mon, 16 Dec 2024      │ │
boefje-1  | │ │                           12:54:46 GMT", "Content-Typ'+415                                   │ │
boefje-1  | │ │                           │   ),                                                             │ │
boefje-1  | │ │                           │   (                                                              │ │
boefje-1  | │ │                           │   │   {'text/html', 'openkat-http/body'},                        │ │
boefje-1  | │ │                           │   │   b'<!DOCTYPE html>\n<html lang="en">\n<head>\n  <meta       │ │
boefje-1  | │ │                           charset="UTF-8">\n  <meta http-eq'+1759                            │ │
boefje-1  | │ │                           │   )                                                              │ │
boefje-1  | │ │                           ]                                                                  │ │
boefje-1  | │ │                mimetype = 'openkat-http/body'                                                │ │
boefje-1  | │ │                     ooi = HTTPResource(                                                      │ │
boefje-1  | │ │                           │   object_type='HTTPResource',                                    │ │
boefje-1  | │ │                           │   scan_profile=InheritedScanProfile(                             │ │
boefje-1  | │ │                           │   │   scan_profile_type='inherited',                             │ │
boefje-1  | │ │                           │   │                                                              │ │
boefje-1  | │ │                           reference='HTTPResource|internet|134.209.85.72|tcp|443|https|inte… │ │
boefje-1  | │ │                           │   │   level=<ScanLevel.L4: 4>,                                   │ │
boefje-1  | │ │                           │   │   user_id=None                                               │ │
boefje-1  | │ │                           │   ),                                                             │ │
boefje-1  | │ │                           │   user_id=None,                                                  │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           primary_key='HTTPResource|internet|134.209.85.72|tcp|443|https|in… │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           website=Reference('Website|internet|134.209.85.72|tcp|443|https|i… │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           web_url=Reference('HostnameHTTPURL|https|internet|mispo.es|443|/'… │ │
boefje-1  | │ │                           │   redirects_to=None                                              │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │                  output = b'<!DOCTYPE html>\n<html lang="en">\n<head>\n  <meta               │ │
boefje-1  | │ │                           charset="UTF-8">\n  <meta http-eq'+1759                            │ │
boefje-1  | │ │                  plugin = Boefje(                                                            │ │
boefje-1  | │ │                           │   id='webpage-analysis',                                         │ │
boefje-1  | │ │                           │   name='WebpageAnalysis',                                        │ │
boefje-1  | │ │                           │   version=None,                                                  │ │
boefje-1  | │ │                           │   created=None,                                                  │ │
boefje-1  | │ │                           │   description='Downloads a resource and uses several different   │ │
boefje-1  | │ │                           normalizers to analyze',                                           │ │
boefje-1  | │ │                           │   enabled=True,                                                  │ │
boefje-1  | │ │                           │   static=True,                                                   │ │
boefje-1  | │ │                           │   type='boefje',                                                 │ │
boefje-1  | │ │                           │   scan_level=2,                                                  │ │
boefje-1  | │ │                           │   consumes={'HTTPResource'},                                     │ │
boefje-1  | │ │                           │   produces={                                                     │ │
boefje-1  | │ │                           │   │   'audio/x-pn-realaudio',                                    │ │
boefje-1  | │ │                           │   │   'application/postscript',                                  │ │
boefje-1  | │ │                           │   │   'application/x-gtar',                                      │ │
boefje-1  | │ │                           │   │   'application/x-bcpio',                                     │ │
boefje-1  | │ │                           │   │   'application/pdf',                                         │ │
boefje-1  | │ │                           │   │   'video/x-sgi-movie',                                       │ │
boefje-1  | │ │                           │   │   'application/json',                                        │ │
boefje-1  | │ │                           │   │   'application/octet-stream',                                │ │
boefje-1  | │ │                           │   │   'text/x-vcard',                                            │ │
boefje-1  | │ │                           │   │   'application/x-troff-me',                                  │ │
boefje-1  | │ │                           │   │   ... +77                                                    │ │
boefje-1  | │ │                           │   },                                                             │ │
boefje-1  | │ │                           │   boefje_schema=None,                                            │ │
boefje-1  | │ │                           │   cron=None,                                                     │ │
boefje-1  | │ │                           │   interval=None,                                                 │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           runnable_hash='740793eea70ddc8ad9a9659ff22c8befc673081e8d3dbb5cab… │ │
boefje-1  | │ │                           │   oci_image=None,                                                │ │
boefje-1  | │ │                           │   oci_arguments=[]                                               │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │             raw_file_id = UUID('8776c2f4-c377-45c3-bf43-9d69a945ec78')                       │ │
boefje-1  | │ │               reference = 'HTTPResource|internet|134.209.85.72|tcp|443|https|internet|mispo… │ │
boefje-1  | │ │                    self = <boefjes.job_handler.BoefjeHandler object at 0x78a7cf145710>       │ │
boefje-1  | │ │         valid_mimetypes = {'text/html', 'openkat-http/body'}                                 │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:25 in wrapper                                       │
boefje-1  | │                                                                                                  │
boefje-1  | │    22 │   @wraps(function)                                                                       │
boefje-1  | │    23 │   def wrapper(self, *args, **kwargs):                                                    │
boefje-1  | │    24 │   │   try:                                                                               │
boefje-1  | │ ❱  25 │   │   │   return function(self, *args, **kwargs)                                         │
boefje-1  | │    26 │   │   except HTTPStatusError as error:                                                   │
boefje-1  | │    27 │   │   │   if error.response.status_code != 401:                                          │
boefje-1  | │    28 │   │   │   │   raise                                                                      │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │     args = (                                                                                 │ │
boefje-1  | │ │            │   UUID('e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'),                                 │ │
boefje-1  | │ │            │   b'<!DOCTYPE html>\n<html lang="en">\n<head>\n  <meta charset="UTF-8">\n       │ │
boefje-1  | │ │            <meta http-eq'+1759,                                                              │ │
boefje-1  | │ │            │   {'boefje/webpage-analysis', 'openkat-http/body', 'text/html'}                 │ │
boefje-1  | │ │            )                                                                                 │ │
boefje-1  | │ │ function = <function BytesAPIClient.save_raw at 0x78a7cfd19da0>                              │ │
boefje-1  | │ │   kwargs = {}                                                                                │ │
boefje-1  | │ │     self = <boefjes.clients.bytes_client.BytesAPIClient object at 0x78a7cfd20090>            │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:118 in save_raw                                     │
boefje-1  | │                                                                                                  │
boefje-1  | │   115 │   │   │   headers=self.headers,                                                          │
boefje-1  | │   116 │   │   │   params={"boefje_meta_id": str(boefje_meta_id)},                                │
boefje-1  | │   117 │   │   )                                                                                  │
boefje-1  | │ ❱ 118 │   │   self._verify_response(response)                                                    │
boefje-1  | │   119 │   │                                                                                      │
boefje-1  | │   120 │   │   return UUID(response.json()[file_name])                                            │
boefje-1  | │   121                                                                                            │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │ boefje_meta_id = UUID('e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea')                                │ │
boefje-1  | │ │      file_name = 'raw'                                                                       │ │
boefje-1  | │ │     mime_types = {'boefje/webpage-analysis', 'openkat-http/body', 'text/html'}               │ │
boefje-1  | │ │            raw = b'<!DOCTYPE html>\n<html lang="en">\n<head>\n  <meta charset="UTF-8">\n     │ │
boefje-1  | │ │                  <meta http-eq'+1759                                                         │ │
boefje-1  | │ │       response = <Response [500 Internal Server Error]>                                      │ │
boefje-1  | │ │           self = <boefjes.clients.bytes_client.BytesAPIClient object at 0x78a7cfd20090>      │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:54 in _verify_response                              │
boefje-1  | │                                                                                                  │
boefje-1  | │    51 │   @staticmethod                                                                          │
boefje-1  | │    52 │   def _verify_response(response: Response) -> None:                                      │
boefje-1  | │    53 │   │   try:                                                                               │
boefje-1  | │ ❱  54 │   │   │   response.raise_for_status()                                                    │
boefje-1  | │    55 │   │   except HTTPStatusError as error:                                                   │
boefje-1  | │    56 │   │   │   if error.response.status_code != 401:                                          │
boefje-1  | │    57 │   │   │   │   logger.error(response.text)                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭───────────────────── locals ──────────────────────╮                                            │
boefje-1  | │ │ response = <Response [500 Internal Server Error]> │                                            │
boefje-1  | │ ╰───────────────────────────────────────────────────╯                                            │
boefje-1  | │                                                                                                  │
boefje-1  | │ /usr/local/lib/python3.11/site-packages/httpx/_models.py:763 in raise_for_status                 │
boefje-1  | │                                                                                                  │
boefje-1  | │    760 │   │   }                                                                                 │
boefje-1  | │    761 │   │   error_type = error_types.get(status_class, "Invalid status code")                 │
boefje-1  | │    762 │   │   message = message.format(self, error_type=error_type)                             │
boefje-1  | │ ❱  763 │   │   raise HTTPStatusError(message, request=request, response=self)                    │
boefje-1  | │    764 │                                                                                         │
boefje-1  | │    765 │   def json(self, **kwargs: typing.Any) -> typing.Any:                                   │
boefje-1  | │    766 │   │   return jsonlib.loads(self.content, **kwargs)                                      │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │   error_type = 'Server error'                                                                │ │
boefje-1  | │ │  error_types = {                                                                             │ │
boefje-1  | │ │                │   1: 'Informational response',                                              │ │
boefje-1  | │ │                │   3: 'Redirect response',                                                   │ │
boefje-1  | │ │                │   4: 'Client error',                                                        │ │
boefje-1  | │ │                │   5: 'Server error'                                                         │ │
boefje-1  | │ │                }                                                                             │ │
boefje-1  | │ │      message = "Server error '500 Internal Server Error' for url                             │ │
boefje-1  | │ │                'http://bytes:8000/bytes/raw?bo"+139                                          │ │
boefje-1  | │ │      request = <Request('POST',                                                              │ │
boefje-1  | │ │                'http://bytes:8000/bytes/raw?boefje_meta_id=e7f42f51-1b0e-4fab-a14b-321d5cf7… │ │
boefje-1  | │ │         self = <Response [500 Internal Server Error]>                                        │ │
boefje-1  | │ │ status_class = 5                                                                             │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
boefje-1  | HTTPStatusError: Server error '500 Internal Server Error' for url 
boefje-1  | 'http://bytes:8000/bytes/raw?boefje_meta_id=e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'
boefje-1  | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
boefje-1  | Traceback (most recent call last):
boefje-1  |   File "/app/boefjes/boefjes/app.py", line 251, in _start_working
boefje-1  |     handler.handle(p_item.data)
boefje-1  |   File "/app/boefjes/boefjes/job_handler.py", line 168, in handle
boefje-1  |     raw_file_id = self.bytes_client.save_raw(
boefje-1  |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 25, in wrapper
boefje-1  |     return function(self, *args, **kwargs)
boefje-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 118, in save_raw
boefje-1  |     self._verify_response(response)
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 54, in _verify_response
boefje-1  |     response.raise_for_status()
boefje-1  |   File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 763, in raise_for_status
boefje-1  |     raise HTTPStatusError(message, request=request, response=self)
boefje-1  | httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://bytes:8000/bytes/raw?boefje_meta_id=e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea'
boefje-1  | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/066ed423-1937-40be-9043-c08f9e30ecef "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:54:46.775390 [info] Set status to TaskStatus.COMPLETED in the scheduler for task[id=066ed423-1937-40be-9043-c08f9e30ecef]
boefje-1  | HTTP Request: GET http://scheduler:8000/tasks/e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea "HTTP/1.1 200 OK"
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/e7f42f51-1b0e-4fab-a14b-321d5cf7f2ea "HTTP/1.1 200 OK"

RetireJS port 443

boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/9bec42fc-c0cb-4379-a72a-18cdbd1e4dd4 "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:55:47.963360 [info] Handling boefje cve-finding-types[task_id=9bec42fc-c0cb-4379-a72a-18cddd1e4dd4]
boefje-1  | HTTP Request: GET http://octopoes_api/aa/object?reference=CVEFindingType%7CCVE-2015-9251&valid_time=2024-12-16%2012%3A55%3A47.983710%2B00%3A00 "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:55:47.945798 [error] An error occurred handling scheduler item[id=9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40]
boefje-1  | ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
boefje-1  | │ /app/boefjes/boefjes/app.py:251 in _start_working                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │   248 │   │                                                                                      │
boefje-1  | │   249 │   │   try:                                                                               │
boefje-1  | │   250 │   │   │   scheduler_client.patch_task(p_item.id, TaskStatus.RUNNING)                     │
boefje-1  | │ ❱ 251 │   │   │   handler.handle(p_item.data)                                                    │
boefje-1  | │   252 │   │   │   status = TaskStatus.COMPLETED                                                  │
boefje-1  | │   253 │   │   except Exception:  # noqa                                                          │
boefje-1  | │   254 │   │   │   logger.exception("An error occurred handling scheduler item[id=%s]", p_item.   │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │          handler = <boefjes.job_handler.BoefjeHandler object at 0x78a7cf145710>              │ │
boefje-1  | │ │   handling_tasks = <DictProxy object, typeid 'dict' at 0x78a7cf4c4fd0>                       │ │
boefje-1  | │ │           p_item = Task(                                                                     │ │
boefje-1  | │ │                    │   id=UUID('9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'),                      │ │
boefje-1  | │ │                    │   scheduler_id='boefje-aa',                                             │ │
boefje-1  | │ │                    │   schedule_id='2fcc4bbf-10fe-4b0b-96e5-be5e6cb91c02',                   │ │
boefje-1  | │ │                    │   priority=2,                                                           │ │
boefje-1  | │ │                    │   status=<TaskStatus.DISPATCHED: 'dispatched'>,                         │ │
boefje-1  | │ │                    │   type='boefje',                                                        │ │
boefje-1  | │ │                    │   hash='a87e06ce2d09987f4ac56ab17d131940',                              │ │
boefje-1  | │ │                    │   data=BoefjeMeta(                                                      │ │
boefje-1  | │ │                    │   │   id=UUID('9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'),                  │ │
boefje-1  | │ │                    │   │   started_at=datetime.datetime(2024, 12, 16, 12, 55, 47, 590878,    │ │
boefje-1  | │ │                    tzinfo=datetime.timezone.utc),                                            │ │
boefje-1  | │ │                    │   │   ended_at=datetime.datetime(2024, 12, 16, 12, 55, 47, 873810,      │ │
boefje-1  | │ │                    tzinfo=datetime.timezone.utc),                                            │ │
boefje-1  | │ │                    │   │   boefje=Boefje(id='retirejs-finding-types', version=None),         │ │
boefje-1  | │ │                    │   │   input_ooi='RetireJSFindingType|RetireJS-jquerymigrate-f3a3',      │ │
boefje-1  | │ │                    │   │   arguments={                                                       │ │
boefje-1  | │ │                    │   │   │   'input': {                                                    │ │
boefje-1  | │ │                    │   │   │   │   'object_type': 'RetireJSFindingType',                     │ │
boefje-1  | │ │                    │   │   │   │   'scan_profile': "scan_profile_type='empty'                │ │
boefje-1  | │ │                    reference=Reference('RetireJSFindingType|RetireJS-jque"+53,               │ │
boefje-1  | │ │                    │   │   │   │   'user_id': 'None',                                        │ │
boefje-1  | │ │                    │   │   │   │   'primary_key':                                            │ │
boefje-1  | │ │                    'RetireJSFindingType|RetireJS-jquerymigrate-f3a3',                        │ │
boefje-1  | │ │                    │   │   │   │   'id': 'RetireJS-jquerymigrate-f3a3',                      │ │
boefje-1  | │ │                    │   │   │   │   'description': 'None',                                    │ │
boefje-1  | │ │                    │   │   │   │   'source': 'None',                                         │ │
boefje-1  | │ │                    │   │   │   │   'impact': 'None',                                         │ │
boefje-1  | │ │                    │   │   │   │   'recommendation': 'None',                                 │ │
boefje-1  | │ │                    │   │   │   │   'risk_score': 0.0,                                        │ │
boefje-1  | │ │                    │   │   │   │   ... +1                                                    │ │
boefje-1  | │ │                    │   │   │   }                                                             │ │
boefje-1  | │ │                    │   │   },                                                                │ │
boefje-1  | │ │                    │   │   organization='aa',                                                │ │
boefje-1  | │ │                    │   │                                                                     │ │
boefje-1  | │ │                    runnable_hash='d8327f705f9adbafc4d992fcb4ccf372b68a5156e1b2910ed1cd99231… │ │
boefje-1  | │ │                    │   │   environment={}                                                    │ │
boefje-1  | │ │                    │   ),                                                                    │ │
boefje-1  | │ │                    │   created_at=datetime.datetime(2024, 12, 16, 12, 55, 41, 292474,        │ │
boefje-1  | │ │                    tzinfo=TzInfo(UTC)),                                                      │ │
boefje-1  | │ │                    │   modified_at=datetime.datetime(2024, 12, 16, 12, 55, 41, 292475,       │ │
boefje-1  | │ │                    tzinfo=TzInfo(UTC))                                                       │ │
boefje-1  | │ │                    )                                                                         │ │
boefje-1  | │ │ scheduler_client = <boefjes.clients.scheduler_client.SchedulerAPIClient object at            │ │
boefje-1  | │ │                    0x78a7cf4aed10>                                                           │ │
boefje-1  | │ │           status = <TaskStatus.FAILED: 'failed'>                                             │ │
boefje-1  | │ │       task_queue = <AutoProxy[Queue] object, typeid 'Queue' at 0x78a7cf4c6310>               │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/job_handler.py:168 in handle                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │   165 │   │   │   │   │   │   │   )                                                              │
boefje-1  | │   166 │   │   │   │   │   │   else:                                                              │
boefje-1  | │   167 │   │   │   │   │   │   │   valid_mimetypes.add(mimetype)                                  │
boefje-1  | │ ❱ 168 │   │   │   │   │   raw_file_id = self.bytes_client.save_raw(                              │
boefje-1  | │   169 │   │   │   │   │   │   boefje_meta.id, output, _default_mime_types(boefje_meta.boefje).   │
boefje-1  | │   170 │   │   │   │   │   )                                                                      │
boefje-1  | │   171 │   │   │   │   │   logger.info(                                                           │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │ boefje_added_mime_types = set()                                                              │ │
boefje-1  | │ │             boefje_meta = BoefjeMeta(                                                        │ │
boefje-1  | │ │                           │   id=UUID('9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'),               │ │
boefje-1  | │ │                           │   started_at=datetime.datetime(2024, 12, 16, 12, 55, 47, 590878, │ │
boefje-1  | │ │                           tzinfo=datetime.timezone.utc),                                     │ │
boefje-1  | │ │                           │   ended_at=datetime.datetime(2024, 12, 16, 12, 55, 47, 873810,   │ │
boefje-1  | │ │                           tzinfo=datetime.timezone.utc),                                     │ │
boefje-1  | │ │                           │   boefje=Boefje(id='retirejs-finding-types', version=None),      │ │
boefje-1  | │ │                           │   input_ooi='RetireJSFindingType|RetireJS-jquerymigrate-f3a3',   │ │
boefje-1  | │ │                           │   arguments={                                                    │ │
boefje-1  | │ │                           │   │   'input': {                                                 │ │
boefje-1  | │ │                           │   │   │   'object_type': 'RetireJSFindingType',                  │ │
boefje-1  | │ │                           │   │   │   'scan_profile': "scan_profile_type='empty'             │ │
boefje-1  | │ │                           reference=Reference('RetireJSFindingType|RetireJS-jque"+53,        │ │
boefje-1  | │ │                           │   │   │   'user_id': 'None',                                     │ │
boefje-1  | │ │                           │   │   │   'primary_key':                                         │ │
boefje-1  | │ │                           'RetireJSFindingType|RetireJS-jquerymigrate-f3a3',                 │ │
boefje-1  | │ │                           │   │   │   'id': 'RetireJS-jquerymigrate-f3a3',                   │ │
boefje-1  | │ │                           │   │   │   'description': 'None',                                 │ │
boefje-1  | │ │                           │   │   │   'source': 'None',                                      │ │
boefje-1  | │ │                           │   │   │   'impact': 'None',                                      │ │
boefje-1  | │ │                           │   │   │   'recommendation': 'None',                              │ │
boefje-1  | │ │                           │   │   │   'risk_score': 0.0,                                     │ │
boefje-1  | │ │                           │   │   │   ... +1                                                 │ │
boefje-1  | │ │                           │   │   }                                                          │ │
boefje-1  | │ │                           │   },                                                             │ │
boefje-1  | │ │                           │   organization='aa',                                             │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           runnable_hash='d8327f705f9adbafc4d992fcb4ccf372b68a5156e1b2910ed1… │ │
boefje-1  | │ │                           │   environment={}                                                 │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │          boefje_results = [                                                                  │ │
boefje-1  | │ │                           │   (                                                              │ │
boefje-1  | │ │                           │   │   set(),                                                     │ │
boefje-1  | │ │                           │   │   b'{\n\t"retire-example": {\n\t\t"vulnerabilities" :        │ │
boefje-1  | │ │                           [\n\t\t\t{\n\t\t\t\t"below" : "0.0.2",\n\t\t\t\t"'+90370           │ │
boefje-1  | │ │                           │   )                                                              │ │
boefje-1  | │ │                           ]                                                                  │ │
boefje-1  | │ │                     ooi = RetireJSFindingType(                                               │ │
boefje-1  | │ │                           │   object_type='RetireJSFindingType',                             │ │
boefje-1  | │ │                           │   scan_profile=EmptyScanProfile(                                 │ │
boefje-1  | │ │                           │   │   scan_profile_type='empty',                                 │ │
boefje-1  | │ │                           │   │                                                              │ │
boefje-1  | │ │                           reference=Reference('RetireJSFindingType|RetireJS-jquerymigrate-f… │ │
boefje-1  | │ │                           │   │   level=<ScanLevel.L0: 0>,                                   │ │
boefje-1  | │ │                           │   │   user_id=None                                               │ │
boefje-1  | │ │                           │   ),                                                             │ │
boefje-1  | │ │                           │   user_id=None,                                                  │ │
boefje-1  | │ │                           │   primary_key='RetireJSFindingType|RetireJS-jquerymigrate-f3a3', │ │
boefje-1  | │ │                           │   id='RetireJS-jquerymigrate-f3a3',                              │ │
boefje-1  | │ │                           │   description=None,                                              │ │
boefje-1  | │ │                           │   source=None,                                                   │ │
boefje-1  | │ │                           │   impact=None,                                                   │ │
boefje-1  | │ │                           │   recommendation=None,                                           │ │
boefje-1  | │ │                           │   risk_score=0.0,                                                │ │
boefje-1  | │ │                           │   risk_severity=<RiskLevelSeverity.PENDING: 'pending'>           │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │                  output = b'{\n\t"retire-example": {\n\t\t"vulnerabilities" :                │ │
boefje-1  | │ │                           [\n\t\t\t{\n\t\t\t\t"below" : "0.0.2",\n\t\t\t\t"'+90370           │ │
boefje-1  | │ │                  plugin = Boefje(                                                            │ │
boefje-1  | │ │                           │   id='retirejs-finding-types',                                   │ │
boefje-1  | │ │                           │   name='RetireJS Finding Types',                                 │ │
boefje-1  | │ │                           │   version=None,                                                  │ │
boefje-1  | │ │                           │   created=None,                                                  │ │
boefje-1  | │ │                           │   description='Hydrate information of RetireJS finding types.',  │ │
boefje-1  | │ │                           │   enabled=True,                                                  │ │
boefje-1  | │ │                           │   static=True,                                                   │ │
boefje-1  | │ │                           │   type='boefje',                                                 │ │
boefje-1  | │ │                           │   scan_level=0,                                                  │ │
boefje-1  | │ │                           │   consumes={'RetireJSFindingType'},                              │ │
boefje-1  | │ │                           │   produces={'boefje/retirejs-finding-types'},                    │ │
boefje-1  | │ │                           │   boefje_schema=None,                                            │ │
boefje-1  | │ │                           │   cron=None,                                                     │ │
boefje-1  | │ │                           │   interval=None,                                                 │ │
boefje-1  | │ │                           │                                                                  │ │
boefje-1  | │ │                           runnable_hash='d8327f705f9adbafc4d992fcb4ccf372b68a5156e1b2910ed1… │ │
boefje-1  | │ │                           │   oci_image=None,                                                │ │
boefje-1  | │ │                           │   oci_arguments=[]                                               │ │
boefje-1  | │ │                           )                                                                  │ │
boefje-1  | │ │               reference = Reference('RetireJSFindingType|RetireJS-jquerymigrate-f3a3')       │ │
boefje-1  | │ │                    self = <boefjes.job_handler.BoefjeHandler object at 0x78a7cf145710>       │ │
boefje-1  | │ │         valid_mimetypes = set()                                                              │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:25 in wrapper                                       │
boefje-1  | │                                                                                                  │
boefje-1  | │    22 │   @wraps(function)                                                                       │
boefje-1  | │    23 │   def wrapper(self, *args, **kwargs):                                                    │
boefje-1  | │    24 │   │   try:                                                                               │
boefje-1  | │ ❱  25 │   │   │   return function(self, *args, **kwargs)                                         │
boefje-1  | │    26 │   │   except HTTPStatusError as error:                                                   │
boefje-1  | │    27 │   │   │   if error.response.status_code != 401:                                          │
boefje-1  | │    28 │   │   │   │   raise                                                                      │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │     args = (                                                                                 │ │
boefje-1  | │ │            │   UUID('9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'),                                 │ │
boefje-1  | │ │            │   b'{\n\t"retire-example": {\n\t\t"vulnerabilities" :                           │ │
boefje-1  | │ │            [\n\t\t\t{\n\t\t\t\t"below" : "0.0.2",\n\t\t\t\t"'+90370,                         │ │
boefje-1  | │ │            │   {'boefje/retirejs-finding-types'}                                             │ │
boefje-1  | │ │            )                                                                                 │ │
boefje-1  | │ │ function = <function BytesAPIClient.save_raw at 0x78a7cfd19da0>                              │ │
boefje-1  | │ │   kwargs = {}                                                                                │ │
boefje-1  | │ │     self = <boefjes.clients.bytes_client.BytesAPIClient object at 0x78a7cfd20090>            │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:118 in save_raw                                     │
boefje-1  | │                                                                                                  │
boefje-1  | │   115 │   │   │   headers=self.headers,                                                          │
boefje-1  | │   116 │   │   │   params={"boefje_meta_id": str(boefje_meta_id)},                                │
boefje-1  | │   117 │   │   )                                                                                  │
boefje-1  | │ ❱ 118 │   │   self._verify_response(response)                                                    │
boefje-1  | │   119 │   │                                                                                      │
boefje-1  | │   120 │   │   return UUID(response.json()[file_name])                                            │
boefje-1  | │   121                                                                                            │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │ boefje_meta_id = UUID('9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40')                                │ │
boefje-1  | │ │      file_name = 'raw'                                                                       │ │
boefje-1  | │ │     mime_types = {'boefje/retirejs-finding-types'}                                           │ │
boefje-1  | │ │            raw = b'{\n\t"retire-example": {\n\t\t"vulnerabilities" :                         │ │
boefje-1  | │ │                  [\n\t\t\t{\n\t\t\t\t"below" : "0.0.2",\n\t\t\t\t"'+90370                    │ │
boefje-1  | │ │       response = <Response [500 Internal Server Error]>                                      │ │
boefje-1  | │ │           self = <boefjes.clients.bytes_client.BytesAPIClient object at 0x78a7cfd20090>      │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | │                                                                                                  │
boefje-1  | │ /app/boefjes/boefjes/clients/bytes_client.py:54 in _verify_response                              │
boefje-1  | │                                                                                                  │
boefje-1  | │    51 │   @staticmethod                                                                          │
boefje-1  | │    52 │   def _verify_response(response: Response) -> None:                                      │
boefje-1  | │    53 │   │   try:                                                                               │
boefje-1  | │ ❱  54 │   │   │   response.raise_for_status()                                                    │
boefje-1  | │    55 │   │   except HTTPStatusError as error:                                                   │
boefje-1  | │    56 │   │   │   if error.response.status_code != 401:                                          │
boefje-1  | │    57 │   │   │   │   logger.error(response.text)                                                │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭───────────────────── locals ──────────────────────╮                                            │
boefje-1  | │ │ response = <Response [500 Internal Server Error]> │                                            │
boefje-1  | │ ╰───────────────────────────────────────────────────╯                                            │
boefje-1  | │                                                                                                  │
boefje-1  | │ /usr/local/lib/python3.11/site-packages/httpx/_models.py:763 in raise_for_status                 │
boefje-1  | │                                                                                                  │
boefje-1  | │    760 │   │   }                                                                                 │
boefje-1  | │    761 │   │   error_type = error_types.get(status_class, "Invalid status code")                 │
boefje-1  | │    762 │   │   message = message.format(self, error_type=error_type)                             │
boefje-1  | │ ❱  763 │   │   raise HTTPStatusError(message, request=request, response=self)                    │
boefje-1  | │    764 │                                                                                         │
boefje-1  | │    765 │   def json(self, **kwargs: typing.Any) -> typing.Any:                                   │
boefje-1  | │    766 │   │   return jsonlib.loads(self.content, **kwargs)                                      │
boefje-1  | │                                                                                                  │
boefje-1  | │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
boefje-1  | │ │   error_type = 'Server error'                                                                │ │
boefje-1  | │ │  error_types = {                                                                             │ │
boefje-1  | │ │                │   1: 'Informational response',                                              │ │
boefje-1  | │ │                │   3: 'Redirect response',                                                   │ │
boefje-1  | │ │                │   4: 'Client error',                                                        │ │
boefje-1  | │ │                │   5: 'Server error'                                                         │ │
boefje-1  | │ │                }                                                                             │ │
boefje-1  | │ │      message = "Server error '500 Internal Server Error' for url                             │ │
boefje-1  | │ │                'http://bytes:8000/bytes/raw?bo"+139                                          │ │
boefje-1  | │ │      request = <Request('POST',                                                              │ │
boefje-1  | │ │                'http://bytes:8000/bytes/raw?boefje_meta_id=9e7860b9-e15f-4b6d-b5a2-5dd8a393… │ │
boefje-1  | │ │         self = <Response [500 Internal Server Error]>                                        │ │
boefje-1  | │ │ status_class = 5                                                                             │ │
boefje-1  | │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
boefje-1  | ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
boefje-1  | HTTPStatusError: Server error '500 Internal Server Error' for url 
boefje-1  | 'http://bytes:8000/bytes/raw?boefje_meta_id=9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'
boefje-1  | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
boefje-1  | Traceback (most recent call last):
boefje-1  |   File "/app/boefjes/boefjes/app.py", line 251, in _start_working
boefje-1  |     handler.handle(p_item.data)
boefje-1  |   File "/app/boefjes/boefjes/job_handler.py", line 168, in handle
boefje-1  |     raw_file_id = self.bytes_client.save_raw(
boefje-1  |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 25, in wrapper
boefje-1  |     return function(self, *args, **kwargs)
boefje-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 118, in save_raw
boefje-1  |     self._verify_response(response)
boefje-1  |   File "/app/boefjes/boefjes/clients/bytes_client.py", line 54, in _verify_response
boefje-1  |     response.raise_for_status()
boefje-1  |   File "/usr/local/lib/python3.11/site-packages/httpx/_models.py", line 763, in raise_for_status
boefje-1  |     raise HTTPStatusError(message, request=request, response=self)
boefje-1  | httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://bytes:8000/bytes/raw?boefje_meta_id=9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40'
boefje-1  | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
boefje-1  | HTTP Request: GET http://scheduler:8000/tasks/9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40 "HTTP/1.1 200 OK"
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40 "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:55:48.073788 [info] Set status to TaskStatus.FAILED in the scheduler for task[id=9e7860b9-e15f-4b6d-b5a2-5dd8a3936a40]
boefje-1  | HTTP Request: PATCH http://scheduler:8000/tasks/7a9e600a-0583-47d3-ba50-8780745aab54 "HTTP/1.1 200 OK"
boefje-1  | 2024-12-16T12:55:48.083481 [info] Handling boefje cve-finding-types[task_id=7a9e600a-0583-47d3-ba50-8780745aab54]

@underdarknl
Contributor

Observed the following two things:

The pending findings should trigger Boefje jobs for those findings against the various hydration boefjes. I am not sure why they have not run, but it is unlikely to be related to this PR.

It looks like Bytes was unable to store the raw files for both boefje runs. I am not sure why. If rescheduling worked, Bytes might have had trouble reaching one of its underlying services (RabbitMQ or PostgreSQL?), which would also be unlikely to be related to this PR.

ammar92 and others added 8 commits December 19, 2024 11:02
@dekkers
Contributor

dekkers commented Jan 3, 2025

This PR removes the only Python package that is installed from git, so the workaround for pip to make this work is no longer necessary. Because grep returns a non-zero exit status if there is no match, the Debian packaging results in an error. In the Dockerfile this isn't a problem because we pipe directly to pip, so the grep exit status is ignored, but we should also remove it there because it is no longer necessary.

We should also check if the new technology files are shipped in the Debian package.
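
The grep behaviour described above can be reproduced with a minimal sketch. The exact grep command used in the packaging is not shown in this thread, so the `git+` pattern, the `REQUIREMENTS` contents, and both helper names are assumptions made for illustration only. It assumes `grep` and `sh` are available on the host.

```python
import subprocess

# Hypothetical requirements text with no git-based packages left in it.
REQUIREMENTS = "httpx==0.27.0\npydantic==2.7.0\n"


def grep_status(text: str, pattern: str) -> int:
    """Run grep on its own and return its exit status (1 means no match)."""
    result = subprocess.run(
        ["grep", "-F", pattern],
        input=text,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode


def pipeline_status(text: str, pattern: str) -> int:
    """Run `grep | cat` via sh; without pipefail only the last command's status counts."""
    result = subprocess.run(
        ["sh", "-c", 'grep -F "$1" | cat', "sh", pattern],
        input=text,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode


if __name__ == "__main__":
    print("grep alone:", grep_status(REQUIREMENTS, "git+"))
    print("grep piped:", pipeline_status(REQUIREMENTS, "git+"))
```

Run standalone, grep reports the missing match through its exit status, which is what a build step that checks every command would trip over. Piped into another command, that status is hidden unless the shell has `pipefail` enabled.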

@dekkers dekkers removed their assignment Jan 7, 2025
@stephanie0x00
Contributor

stephanie0x00 commented Jan 9, 2025

Checklist for QA:

  • I have checked out this branch, and successfully ran a fresh make reset.
  • I confirmed that there are no unintended functional regressions in this branch:
    • I have managed to pass the onboarding flow
    • Objects and Findings are created properly
    • Tasks are created and completed properly
  • I confirmed that the PR's advertised feature or hotfix works as intended.
  • I checked the logs for errors and/or warnings and made issues where necessary

What works:

The normalizer runs and checks for software on the main page. @ammar92 added inline script tags and updated the technology files. It also works with the current version of the technology files, and it now also produces HAR files.

The bugs below are known, but as this PR is a big improvement over what is currently on main, I suggest we merge this, as maintaining this is quite annoying. :)

What doesn't work:

See below.

Bug or feature?:

The current version on main and this PR both do not check for HTTP resources on pages (e.g. /js/core.js is not analysed and picked up), while this is often where software (versions) can be found. The very old version of Wappalyzer did analyse these files, but did not create proof. This will be picked up in a new PR, as it also requires some discussion on how pages are crawled/analysed and how external resources are handled. #4020
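
To make the gap concrete, here is a rough sketch of how externally referenced scripts such as /js/core.js could be listed from a page body so that they can be fetched and analysed later. It is not how the boefje or normalizer works. The class name, function name, and example URL are hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcCollector(HTMLParser):
    """Collect absolute URLs of <script src=...> tags; inline scripts are skipped."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.script_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths like /js/core.js against the page URL.
            self.script_urls.append(urljoin(self.base_url, src))


def external_script_urls(html: str, base_url: str) -> list[str]:
    """Return the external script URLs referenced by an HTML document."""
    collector = ScriptSrcCollector(base_url)
    collector.feed(html)
    return collector.script_urls


if __name__ == "__main__":
    page = '<html><head><script src="/js/core.js"></script><script>var a = 1;</script></head></html>'
    print(external_script_urls(page, "http://example.test/index.html"))
```

Whether such resources should be fetched at all, and how external hosts are handled, is the open question referred to above.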


Quality Gate failed

Failed conditions
67.2% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@underdarknl underdarknl merged commit 20ac859 into main Jan 10, 2025
21 of 22 checks passed
@underdarknl underdarknl deleted the fix/update-wappalyzer branch January 10, 2025 10:33
Labels
😸 Review/QA feedback Review/QA feedback provided
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wappalyzer boefje detects less software instances than before
5 participants